Article Recommender
import pandas as pd
import numpy as np
%matplotlib inline
Loading data and preprocessing
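We first load the pickled article database, then clean it and separate the interesting articles from the uninteresting ones. An article is labelled as interesting (label 1) when its note is strictly positive, and as uninteresting (label 0) otherwise.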
df = pd.read_pickle('./article.pkl')
del df["html"]
del df["image"]
del df["URL"]
del df["hash"]
del df["source"]
df["label"] = df["note"].apply(lambda x: 0 if x <= 0 else 1)
df.head(5)
| | authors | note | resume | texte | titre | label |
|---|---|---|---|---|---|---|
| 0 | [Danny Bradbury, Marco Santori, Adam Draper, M... | -10.0 | Black Market Reloaded, a black market site tha... | Black Market Reloaded, a black market site tha... | Black Market Reloaded back online after source... | 0 |
| 1 | [Emily Spaven, Stan Higgins, Emilyspaven] | 1.0 | The UK Home Office believes the government sho... | The UK Home Office believes the government sho... | Home Office: UK Should Create a Crime-Fighting... | 1 |
| 2 | [Pete Rizzo, Alex Batlin, Yessi Bello Perez, P... | -10.0 | Though lofty in its ideals, lead developer Dan... | A new social messaging app is aiming to disrup... | Gems Bitcoin App Lets Users Earn Money From So... | 0 |
| 3 | [Nermin Hajdarbegovic, Stan Higgins, Pete Rizz... | 3.0 | US satellite service provider DISH Network has... | US satellite service provider DISH Network has... | DISH Becomes World's Largest Company to Accept... | 1 |
| 4 | [Stan Higgins, Bailey Reutzel, Garrett Keirns,... | -10.0 | An unidentified 28-year-old man was robbed of ... | An unidentified 28-year-old man was robbed of ... | Bitcoin Stolen at Gunpoint in New York City Ro... | 0 |
Basic statistics on the dataset
Let's explore the dataset and extract some numbers, starting with the number of liked and disliked articles:
df["label"].value_counts()
0 879
1 324
Name: label, dtype: int64
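The classes are imbalanced, so a model that always predicts the majority class (disliked) already reaches a fairly high accuracy. A quick sketch of that baseline, using the counts printed above; the variable names are illustrative and not part of the original notebook:

```python
# Class counts copied from the value_counts() output above.
n_disliked = 879  # label 0
n_liked = 324     # label 1

total = n_disliked + n_liked

# Accuracy of a trivial model that always predicts the majority class.
majority_baseline = max(n_disliked, n_liked) / total
liked_share = n_liked / total

print(f"total articles: {total}")
print(f"share of liked articles: {liked_share:.1%}")
print(f"majority-class baseline accuracy: {majority_baseline:.1%}")
```

Accuracy figures are worth reading against this baseline of roughly 73%, rather than against 50%.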
Create the full content column
df['full_content'] = df.titre + ' ' + df.resume  # exclude the full article text (texte) for the moment
df.head(1)
| | authors | note | resume | texte | titre | label | full_content |
|---|---|---|---|---|---|---|---|
| 0 | [Danny Bradbury, Marco Santori, Adam Draper, M... | -10.0 | Black Market Reloaded, a black market site tha... | Black Market Reloaded, a black market site tha... | Black Market Reloaded back online after source... | 0 | Black Market Reloaded back online after source... |
from sklearn.model_selection import train_test_split
training, testing = train_test_split(
df, # The dataset we want to split
train_size=0.75, # The proportional size of our training set
stratify=df.label, # The labels are used for stratification
random_state=400 # Use the same random state for reproducibility
)
training.head(5)
| | authors | note | resume | texte | titre | label | full_content |
|---|---|---|---|---|---|---|---|
| 748 | [Jon Brodkin] | -10.0 | Amazon, Reddit, Mozilla, and other Internet co... | Amazon, Reddit, Mozilla, and other Internet co... | Amazon and Reddit try to save net neutrality r... | 0 | Amazon and Reddit try to save net neutrality r... |
| 1183 | [Jon Brodkin] | -10.0 | (The Time Warner involved in this transaction ... | A group of mostly Democratic senators led by A... | Democrats urge Trump administration to block A... | 0 | Democrats urge Trump administration to block A... |
| 769 | [Joseph Brogan] | -10.0 | On Twitter, bad news comes at all hours, with ... | On Twitter, bad news comes at all hours, with ... | Some of the best art on Twitter comes from the... | 0 | Some of the best art on Twitter comes from the... |
| 57 | [Michael Del Castillo, Pete Rizzo, Trond Vidar... | -10.0 | Publicly traded online travel service Webjet i... | Publicly traded online travel service Webjet i... | Webjet Ethereum Pilot Targets Hotel Industry's... | 0 | Webjet Ethereum Pilot Targets Hotel Industry's... |
| 892 | [Andrew Cunningham] | 10.0 | What has changed on the 2017 MacBook, then?\nI... | Andrew Cunningham\n\nAndrew Cunningham\n\nAndr... | Mini-review: The 2017 MacBook could actually b... | 1 | Mini-review: The 2017 MacBook could actually b... |
from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
from sklearn.svm import LinearSVC, SVC
from sklearn.pipeline import Pipeline
from sklearn.model_selection import cross_val_predict
from utils.plotting import pipeline_performance
steps = (
('vectorizer', TfidfVectorizer()),
('classifier', LinearSVC())
)
pipeline = Pipeline(steps)
predicted_labels = cross_val_predict(pipeline, training.full_content, training.label)
pipeline_performance(training.label, predicted_labels)
# Fit on the same column that was used for cross-validation
pipeline = pipeline.fit(training.full_content, training.label)
Accuracy = 80.6%
Confusion matrix, without normalization
[[624 35]
[140 103]]
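Accuracy alone hides how the errors are split between the two classes. The sketch below derives precision and recall for the liked class from the confusion matrix printed above. It assumes the usual scikit-learn layout (rows are true labels, columns are predicted labels), which matches the row sums of a stratified 75% training split; the variable names are illustrative:

```python
# Confusion matrix copied from the cross-validation output above.
# Assumption: rows = true labels, columns = predicted labels
# (0 = disliked, 1 = liked).
tn, fp = 624, 35    # true disliked: predicted disliked / predicted liked
fn, tp = 140, 103   # true liked:    predicted disliked / predicted liked

total = tn + fp + fn + tp
accuracy = (tn + tp) / total
precision = tp / (tp + fp)   # how many predicted "liked" are really liked
recall = tp / (tp + fn)      # how many liked articles are actually found

print(f"accuracy:  {accuracy:.1%}")
print(f"precision: {precision:.1%}")
print(f"recall:    {recall:.1%}")
```

Under that assumption the classifier is fairly precise when it predicts a like (about 75%), but it recovers fewer than half of the liked articles (about 42% recall).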

import re
from utils.plotting import print_top_features
from sklearn.model_selection import GridSearchCV
def mask_integers(s):
    # Replace every run of digits with a single placeholder token
    return re.sub(r'\d+', 'INTMASK', s)
steps = (
('vectorizer', TfidfVectorizer()),
('classifier', LinearSVC())
)
pipeline = Pipeline(steps)
gs_params = {
#'vectorizer__use_idf': (True, False),
'vectorizer__lowercase': [True, False],
'vectorizer__stop_words': ['english', None],
'vectorizer__ngram_range': [(1, 1), (1, 2), (2, 2)],
'vectorizer__preprocessor': [mask_integers, None],
'classifier__C': np.linspace(5,20,25)
}
gs = GridSearchCV(pipeline, gs_params, n_jobs=1)
gs.fit(training.full_content, training.label)
print(gs.best_params_)
print(gs.best_score_)
pipeline1 = gs.best_estimator_
predicted_labels = pipeline1.predict(testing.full_content)
pipeline_performance(testing.label, predicted_labels)
print_top_features(pipeline1, n_features=10)
# Titles of the liked test articles that the model failed to recognise
correct = gs.predict(testing.full_content) == testing.label
liked = testing.label == 1
testing.loc[liked & ~correct, "titre"]
#pipeline1.predict(["windows xbox bitcoin"])
import joblib  # sklearn.externals.joblib was removed from recent scikit-learn releases
joblib.dump(pipeline1, 'classifier.pkl')
gs.predict(['Google'])
array([1], dtype=int64)
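The `mask_integers` preprocessor tried in the grid search above replaces every run of digits with one shared token, so that "2017" and "1299" count as the same feature. One point to keep in mind: a custom `preprocessor` passed to `TfidfVectorizer` replaces the built-in preprocessing step, which is also where lowercasing happens, so the `lowercase` setting has no effect whenever `mask_integers` is selected. The sketch below shows the masking on a made-up sentence, together with a hypothetical variant (`mask_integers_lower`, not part of the original notebook) that lowercases as well:

```python
import re

def mask_integers(s):
    # Same rule as in the notebook: each run of digits becomes one token.
    return re.sub(r'\d+', 'INTMASK', s)

def mask_integers_lower(s):
    # Hypothetical variant: lowercase first, then mask the integers,
    # so that "Bitcoin" and "bitcoin" end up as the same token.
    return re.sub(r'\d+', 'INTMASK', s.lower())

sample = "The 2017 MacBook costs 1299 dollars"  # made-up example text
print(mask_integers(sample))
print(mask_integers_lower(sample))
```

Either function could be listed under `vectorizer__preprocessor` in the parameter grid.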
steps = (
('vectorizer', TfidfVectorizer()),
('classifier', SVC())
)
pipeline = Pipeline(steps)
gs_params = {
#'vectorizer__use_idf': (True, False),
'vectorizer__stop_words': ['english', None],
'vectorizer__ngram_range': [(1, 1), (1, 2), (2, 2)],
'vectorizer__preprocessor': [mask_integers, None],
'classifier__C': np.linspace(5,20,25)
}
gs = GridSearchCV(pipeline, gs_params, n_jobs=1)
gs.fit(training.full_content, training.label)
print(gs.best_params_)
print(gs.best_score_)
pipeline1 = gs.best_estimator_
predicted_labels = pipeline1.predict(testing.full_content)
pipeline_performance(testing.label, predicted_labels)
print_top_features(pipeline1, n_features=10)
{'classifier__C': 5.0, 'vectorizer__ngram_range': (1, 1), 'vectorizer__preprocessor': <function mask_integers at 0x00000237491B67B8>, 'vectorizer__stop_words': 'english'}
0.711180124224
Accuracy = 71.2%
Confusion matrix, without normalization
[[153 0]
[ 62 0]]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-9-3e0781e307fb> in <module>()
25 pipeline_performance(testing.label, predicted_labels)
26
---> 27 print_top_features(pipeline1, n_features=10)
C:\Users\Guillaume\Documents\Code\recommandation\utils\plotting.py in print_top_features(pipeline, vectorizer_name, classifier_name, n_features)
81 def print_top_features(pipeline, vectorizer_name='vectorizer', classifier_name='classifier', n_features=7):
82 vocabulary = np.array(pipeline.named_steps[vectorizer_name].get_feature_names())
---> 83 coefs = pipeline.named_steps[classifier_name].coef_[0]
84 top_feature_idx = np.argsort(coefs)
85 top_features = vocabulary[top_feature_idx]
C:\Users\Guillaume\Anaconda3\lib\site-packages\sklearn\svm\base.py in coef_(self)
483 def coef_(self):
484 if self.kernel != 'linear':
--> 485 raise ValueError('coef_ is only available when using a '
486 'linear kernel')
487
ValueError: coef_ is only available when using a linear kernel

from sklearn.naive_bayes import BernoulliNB
steps = (
('vectorizer', TfidfVectorizer()),
('classifier', BernoulliNB())
)
pipeline2 = Pipeline(steps)
gs_params = {
'vectorizer__stop_words': ['english', None],
'vectorizer__ngram_range': [(1, 1), (1, 2), (2, 2)],
'vectorizer__preprocessor': [mask_integers, None],
'classifier__alpha': np.linspace(0,1,5),
'classifier__fit_prior': [True, False]
}
gs = GridSearchCV(pipeline2, gs_params, n_jobs=1)
gs.fit(training.full_content, training.label)
print(gs.best_params_)
print(gs.best_score_)
pipeline2 = gs.best_estimator_
predicted_labels = pipeline2.predict(testing.full_content)
pipeline_performance(testing.label, predicted_labels)
print_top_features(pipeline2, n_features=10)
C:\Users\Guillaume\Anaconda3\lib\site-packages\sklearn\naive_bayes.py:801: RuntimeWarning: divide by zero encountered in log
self.feature_log_prob_ = (np.log(smoothed_fc) -
C:\Users\Guillaume\Anaconda3\lib\site-packages\sklearn\naive_bayes.py:820: RuntimeWarning: divide by zero encountered in log
neg_prob = np.log(1 - np.exp(self.feature_log_prob_))
C:\Users\Guillaume\Anaconda3\lib\site-packages\sklearn\naive_bayes.py:823: RuntimeWarning: invalid value encountered in add
jll += self.class_log_prior_ + neg_prob.sum(axis=1)
{'classifier__alpha': 0.25, 'classifier__fit_prior': True, 'vectorizer__ngram_range': (1, 1), 'vectorizer__preprocessor': <function mask_integers at 0x00000237491B67B8>, 'vectorizer__stop_words': 'english'}
0.805900621118
Accuracy = 78.1%
Confusion matrix, without normalization
[[140 13]
[ 34 28]]
Top like features:
['use' 'just' 'year' 'price' 'time' 'Bitcoin' 'bitcoin' 'new' 'The'
'INTMASK']
---
Top dislike features:
['ABBA' 'cable' 'cab' 'byte' 'publication' 'bye' 'publications' 'publicity'
'buyer' 'publicizing']
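The `divide by zero encountered in log` warnings in the output above are consistent with the grid including `alpha = 0`: without smoothing, a word that never occurs in one class gets an estimated probability of zero, and its logarithm is minus infinity. A minimal sketch of the smoothed Bernoulli estimate, using only the standard library; the function name and the counts are illustrative assumptions, not values from the notebook:

```python
import math

def smoothed_log_prob(feature_count, class_count, alpha):
    """Log of the smoothed probability that a word occurs in a document of a class.

    feature_count: documents of the class that contain the word
    class_count:   documents of the class
    alpha:         additive smoothing strength (0 means no smoothing)
    """
    p = (feature_count + alpha) / (class_count + 2 * alpha)
    # math.log(0) raises an error, so return -inf explicitly
    # (NumPy returns -inf and emits a warning instead).
    return math.log(p) if p > 0 else float("-inf")

# A word never seen in a class of 10 documents:
print(smoothed_log_prob(0, 10, alpha=0.0))   # no smoothing
print(smoothed_log_prob(0, 10, alpha=0.25))  # smallest non-zero alpha in the grid
```

Starting the grid at a small positive value, for example `np.linspace(0.25, 1, 4)`, would keep the same non-zero candidates while avoiding the unsmoothed case.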

from sklearn.naive_bayes import MultinomialNB
steps = (
('vectorizer', TfidfVectorizer()),
('classifier', MultinomialNB())
)
pipeline3 = Pipeline(steps)
gs_params = {
'vectorizer__stop_words': ['english', None],
'vectorizer__ngram_range': [(1, 1), (1, 2), (2, 2)],
'vectorizer__preprocessor': [mask_integers, None],
'classifier__alpha': np.linspace(0,1,5),
'classifier__fit_prior': [True, False]
}
gs = GridSearchCV(pipeline3, gs_params, n_jobs=1)
gs.fit(training.full_content, training.label)
print(gs.best_params_)
print(gs.best_score_)
pipeline3 = gs.best_estimator_
predicted_labels = pipeline3.predict(testing.full_content)
pipeline_performance(testing.label, predicted_labels)
print_top_features(pipeline3, n_features=10)
C:\Users\Guillaume\Anaconda3\lib\site-packages\sklearn\naive_bayes.py:699: RuntimeWarning: divide by zero encountered in log
self.feature_log_prob_ = (np.log(smoothed_fc) -
{'classifier__alpha': 0.5, 'classifier__fit_prior': False, 'vectorizer__ngram_range': (1, 1), 'vectorizer__preprocessor': <function mask_integers at 0x00000237491B67B8>, 'vectorizer__stop_words': 'english'}
0.80900621118
Accuracy = 79.1%
Confusion matrix, without normalization
[[141 12]
[ 33 29]]
Top like features:
['time' 'Google' 'Pro' 'Apple' 'new' 'The' 'Bitcoin' 'price' 'bitcoin'
'INTMASK']
---
Top dislike features:
['ABBA' 'categories' 'catching' 'catalyst' 'catalog' 'casually' 'casts'
'cast' 'cashier' 'ran']
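The confusion matrix above says more than the accuracy figure alone. Here is a minimal sketch that derives the per-class metrics from it, assuming rows are true labels and columns are predicted labels (the scikit-learn `confusion_matrix` convention, which `pipeline_performance` is assumed to follow) and that class 1 means "liked":

```python
# Confusion matrix copied from the output above.
# Assumption: rows = true label, columns = predicted label, class 1 = liked.
cm = [[141, 12],
      [33, 29]]

tn, fp = cm[0]   # disliked articles: correctly rejected / wrongly recommended
fn, tp = cm[1]   # liked articles: missed / correctly recommended
total = tn + fp + fn + tp

accuracy = (tp + tn) / total
precision = tp / (tp + fp)    # share of recommended articles that were liked
recall = tp / (tp + fn)       # share of liked articles that were recommended
f1 = 2 * precision * recall / (precision + recall)

# Accuracy of always predicting the majority class (dislike) on this test set.
majority_baseline = (tn + fp) / total

print(f"accuracy          = {accuracy:.1%}")
print(f"precision (like)  = {precision:.1%}")
print(f"recall (like)     = {recall:.1%}")
print(f"F1 (like)         = {f1:.1%}")
print(f"majority baseline = {majority_baseline:.1%}")
```

Under these assumptions the recall on liked articles is below 50%, and always predicting "dislike" would already reach about 71% accuracy on this split, so accuracy alone flatters the classifier.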

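The divide-by-zero warning in the output above is consistent with the grid including `alpha = 0`, the first value of `np.linspace(0,1,5)`. The sketch below is a plain-Python illustration of additive smoothing, not scikit-learn's actual implementation; the function name `feature_log_prob` and the toy counts are hypothetical. It shows why a zero count combined with `alpha = 0` gives a log of zero.

```python
import math

def feature_log_prob(counts, alpha):
    """Log P(feature | class) with additive smoothing (illustrative only)."""
    smoothed = [c + alpha for c in counts]
    total = sum(smoothed)
    # math.log(0) raises ValueError, whereas NumPy returns -inf with a
    # RuntimeWarning; the guard below mimics NumPy's -inf result.
    return [math.log(s / total) if s > 0 else float("-inf") for s in smoothed]

# Toy counts for one class: the second feature never occurs in that class.
counts = [3, 0, 1]

print(feature_log_prob(counts, alpha=0))    # contains -inf for the unseen feature
print(feature_log_prob(counts, alpha=0.5))  # every value is finite

# A grid that starts above zero avoids the degenerate case while keeping
# the other values of the original grid.
alphas = [0.25 * k for k in range(1, 5)]
print(alphas)
```

Since the best `alpha` reported above is 0.5, dropping 0 from the grid (for example with `np.linspace(0.25, 1, 4)`) should leave the selected model unchanged while silencing the warnings.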